ABSTRACT

Sports results forecasting has become increasingly popular among fans,
which makes predicting the outcome of a sports match a new and
interesting challenge. This paper presents a logic mining technique to model
the results (Win or Draw / Lose) of the football matches played in the English
Premier League, Spanish La Liga and France Ligue 1. In this research,
a method, namely the k Satisfiability based Reverse Analysis method (KSATRA)
hybridized with Ant Colony Optimization (ACO), is brought forward to
obtain the logical relationship among the clubs in these leagues. The logical
rule obtained from the football matches is used to categorize the results of
future matches. ACO is a population-based, nature-inspired algorithm for
solving combinatorial optimization problems. KSATRA makes use of
the advantages of the Hopfield Neural Network and the k Satisfiability
representation. The data set used in this study includes the data of 6 clubs
from each league and comprises all league matches from 2014 to
2018. The effectiveness of KSATRA with ACO in obtaining the logical rule of
football matches was tested based on root mean square error (RMSE), mean
absolute error (MAE), mean absolute percentage error (MAPE) and CPU
time. Results acquired from the computer simulation show the robustness
of KSATRA in exhibiting the performance of the clubs.

1. INTRODUCTION

Football is a popular sport in which two teams of eleven players each compete. In football,
a large amount of data can be collected for each player, club, match and season. Many football clubs, coaches
and leagues have begun to take an interest in analysing these data to help assess their players, improve game
strategies and even predict match results. Match result prediction is naturally one of the most
obvious objectives in sports analytics. There have been many studies on match prediction in football.
However, most of those studies focused on predicting the result in terms of win, loss or draw and the
final score. For example, Tsakonas et al. [1] proposed soft computing methods to predict the result
of a football match using neural networks, fuzzy rules and a genetic programming approach. On the
other hand, Baio and Blangiardo [2] proposed a Bayesian hierarchical model to predict the final score of a
football match. Nunes and Sousa [3] successfully applied data mining techniques to two football data sets
and, with classification, created a model which provided better results than a pure probabilistic classifier.
Cintia et al. [4] outlined the performance of a football team in a game by 3 network indicators.
They observed that these indicators correspond with the teams' success, and their approach also outperformed
two other models in predicting the outcomes of long-running competitions. Chen et al. [5] proposed a
decision tree-based multi-modal data mining framework for soccer goal detection. The experimental results
showed that the integration of data mining and multi-modal processing of video was a feasible approach
to effectively extract soccer goal events.

Match result prediction can be a really challenging task due to the sport's stochastic nature.
Prediction of match outcomes is done before the game starts so that the team managers, coaches and players are
able to estimate the possibility of winning, drawing or losing the game. In team sports, there are two types of
analysis that can be conducted: a) individual and b) team analysis. In terms of analysing team performance,
current methods [6,7] mostly predict the result or the number of goals of a certain match. However, such
methods have been reported only for special cases or events. They do not provide the generality of the
relationship among clubs in the league. Several methods require unnecessary mathematical complexity, such
as multiple assumptions that distract from the actual objective of the method. Unfortunately, these assumptions and
this complexity have been reported to cause overfitting in data mining. In this paper, we employ the 2
Satisfiability based Reverse Analysis method to induce the best logical rule that shows the trend of results
among 6 clubs in 3 different football leagues, namely the English Premier League (EPL), Spanish La Liga (SLL)
and France Ligue 1 (FL1). Through the relationship among the clubs in the league, the team managers,
coaches and players of a certain club will be able to forecast the result of a match by taking the results of
another club into consideration. A positive outcome prediction can raise the players' spirit.
A negative outcome prediction does not necessarily lower the players' spirit but serves as guidance for them to play
exceptionally cautiously and possibly come up with a counter strategy.

Optimization inspired by swarm intelligence (SI) has been widely sought after during the last
decade. Ant Colony Optimization (ACO) was proposed as a method to solve hard combinatorial optimization
problems [8]. By leaving behind a trail of pheromone, ants are able to locate the shortest trail connecting their
nest and the food source. Zhang and Crossley [9] proposed that ACO can be utilized to effectively solve
optimization problems and that ACO produces optimum solutions. Zangari et al. [10] confirmed that binary
ACO can achieve moderate results by using a fair number of fitness evaluations. Wang et al. [11] also
showed that binary ACO is useful and adaptive in remote sensing image classification. ACO can present
superior performance compared to other algorithms in terms of fitness value.

An artificial neural network (ANN) learns from the biological nervous system of human beings,
for example from how information is processed by the brain [12]. The Hopfield Neural Network (HNN) is one of the
well-known networks implemented to solve various optimization problems [13]. HNN shows outstanding
learning behaviour, for example productive learning and retrieval operations. Traditional HNN is susceptible
to a few deficiencies [14], so logic programming is embedded into HNN as a single intelligent unit [15].

Logic mining in HNN was proposed by Sathasivam [16] by applying the Reverse Analysis method.
This method is able to obtain the logical rule among neurons. Velavan et al. [17] showed that mean field
theory applied to logic programming in HNN is fruitful in accelerating the computational ability of neuro-symbolic
integration. 2 Satisfiability (2SAT) was discovered to enhance the representation of
general SAT itself [18]. This makes 2SAT a suitable approach to represent logical rules in neural networks.
By considering only 2 literals per clause, the logical complexity in learning the relationship between the
variables in real-life problems decreases. By hybridizing Reverse Analysis, 2SAT and ACO, a new method,
the 2 Satisfiability based Reverse Analysis method (2SATRA) with ACO, will be utilized to obtain the logical
rule of football matches.


2. SATISFIABILITY REPRESENTATION 

2 Satisfiability (2SAT) is a logical rule that comprises only 2 literals per clause. 2SAT is usually
expressed as a Boolean formula in Conjunctive Normal Form (CNF), also called a Krom formula. 2SAT consists of
three components [19]:
a) A set of x variables, $v_1, v_2, \ldots, v_x$.
b) A set of literals. A literal can be any variable or the negation of any variable.
c) A set of y definite clauses, $C_1, C_2, C_3, \ldots, C_y$, linked by logical AND ($\wedge$). Each clause comprises
strictly 2 literals joined by logical OR ($\vee$).

Each variable can only take a bipolar value of 1 or -1, which represents true or false
respectively. The explicit definition of the 2SAT formula $P_{2SAT}$ is given by:

$$P_{2SAT} = \bigwedge_{i=1}^{y} C_i \tag{1}$$

where each $C_i$ is a clause with 2 literals,

$$C_i = (m_i \vee n_i) \tag{2}$$

with $m_i$ and $n_i$ denoting the two literals of clause $C_i$. The main objective of the 2SAT representation is to
find a consistent interpretation that makes the formula $P_{2SAT}$ satisfied [20].
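The definitions above can be illustrated with a minimal Python sketch. It assumes a hypothetical encoding that is not taken from the paper: a literal is a pair (variable name, is_positive), a clause is a pair of literals, and the example formula $(A \vee B) \wedge (\neg C \vee D)$ is invented for illustration only.

```python
from itertools import product

# A literal is (variable_name, is_positive); a clause is a pair of literals.
# This formula is a hypothetical example, not one taken from the paper.
FORMULA = [
    (("A", True), ("B", True)),    # A OR B
    (("C", False), ("D", True)),   # (NOT C) OR D
]

def literal_value(literal, state):
    """Return the bipolar value (1 or -1) of a literal under `state`."""
    name, positive = literal
    return state[name] if positive else -state[name]

def clause_satisfied(clause, state):
    """A 2SAT clause is satisfied when at least one of its literals equals 1."""
    return any(literal_value(lit, state) == 1 for lit in clause)

def formula_satisfied(formula, state):
    """The CNF formula is satisfied when every clause is satisfied."""
    return all(clause_satisfied(clause, state) for clause in formula)

def consistent_interpretations(formula):
    """Enumerate every bipolar assignment that satisfies the formula."""
    names = sorted({name for clause in formula for name, _ in clause})
    for values in product((1, -1), repeat=len(names)):
        state = dict(zip(names, values))
        if formula_satisfied(formula, state):
            yield state
```

Exhaustive enumeration is only practical for a handful of variables; it is used here to make the notion of a consistent interpretation concrete, not as a learning method.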


3. ANT COLONY OPTIMIZATION 

Ant Colony Optimization (ACO) simulates the foraging behaviour of real ants [8-21].
Real ants traverse the space surrounding their nest at random when searching for food. When an ant finds a
food source, it evaluates the food and carries some of it back to the nest. When travelling back to the nest,
the ant leaves a trace of pheromone on the ground. The density of the pheromone it deposits is decided by the
amount and value of the food. The deposited pheromone leads the other ants to the food source. Through the
pheromone trails, the ants are able to find short paths from their nest to the food source [22, 23].

The ACO algorithm consists of artificial ants (agents) with distinctive functions and structures.
The agents work with each other to accomplish a unified behaviour for the system as a whole,
creating a robust system that has the ability to find high quality solutions for problems with a huge search
space. Dorigo and Di Caro [24] proposed this system as a metaheuristic to solve combinatorial optimization
problems (COPs). This metaheuristic has proven to be robust and flexible, since it has been applied
successfully to different COPs [25]. The ACO algorithm for learning 2SAT is inspired by Changdar et al. [26]:
Step 1: Initialization. Initialize the state of the bitstring, $S_i$, where $S_i(t) \in \{1, -1\}$.


Step 2: Fitness evaluation. Calculate the fitness of $S_i(t)$ by using the following equation [27]:

$$f(S_i) = \sum_{i=1}^{NC} C_i \tag{3}$$

where NC is the number of clauses in 2SAT and $C_i$ is given as follows:

$$C_i = \begin{cases} 1, & \text{True} \\ 0, & \text{False} \end{cases} \tag{4}$$


Step 3: Pheromone density initialization. Pheromone for each value of candidate group 1 or -1 is
represented by a real vector $\tau_i(1) = (\tau_{i1}, \tau_{i2}, \ldots, \tau_{iU})$ and $\tau_i(-1) = (\tau_{i1}, \tau_{i2}, \ldots, \tau_{iU})$, where each $\tau_{ij}$ is a
random number between [0,1], $i = 1, 2, \ldots, V$; $j = 1, 2, \ldots, U$.
Step 4: Visibility density initialization. Visibility density for each value of candidate group 1 or -1 is
represented by a real vector $\eta_i(1) = (\eta_{i1}, \eta_{i2}, \ldots, \eta_{iU})$ and $\eta_i(-1) = (\eta_{i1}, \eta_{i2}, \ldots, \eta_{iU})$, where each $\eta_{ij}$ is a
random number between [0,1], $i = 1, 2, \ldots, V$; $j = 1, 2, \ldots, U$.


Step 5: Ant's searching phase. The movement probability of the ant “k” ($k = 1, 2, \ldots, M$) is defined
as follows:

$$p_{ij}^{k}(-1) = \frac{[\tau_{ij}(-1)]^{\alpha}\,[\eta_{ij}(-1)]^{\beta}}{[\tau_{ij}(-1)]^{\alpha}\,[\eta_{ij}(-1)]^{\beta} + [\tau_{ij}(1)]^{\alpha}\,[\eta_{ij}(1)]^{\beta}} \tag{5}$$

where $\alpha$ ($\alpha > 0$) is the relative importance of the pheromone and $\beta$ ($\beta > 0$) is the relative importance of
the visibility of the ants. Hence, the complementary probability of the movement is written as follows:

$$p_{ij}^{k}(1) = 1 - p_{ij}^{k}(-1) \tag{6}$$

where $p_{ij}^{k}$ is the probability of movement from the bit “i” to the state “j” at time t.


Step 6: Evaporation. The decrement of pheromone is based on the following equations:

$$\tau_{ij}(-1)(t+1) = (1 - \rho)\,\tau_{ij}(-1)(t) + \Delta\tau_{ij}^{best} \tag{7}$$

$$\tau_{ij}(1)(t+1) = (1 - \rho)\,\tau_{ij}(1)(t) + \Delta\tau_{ij}^{best} \tag{8}$$

$$\Delta\tau_{ij}^{best} = \frac{1}{f(S^{best})} \tag{9}$$

where $\rho$ is the coefficient representing the evaporation rate, $\rho \in [0, 1]$, and $f(S^{best})$ is the number of clauses
for 2SAT.
Step 7: Refinement. Calculate the fitness of the solution $f(S_i)$ by using (3). The optimal solution found so
far will be recorded as $S_i(t+1)$. If $S_i(t+1)$ is superior to $S_i(t)$, then $S_i(t)$ is replaced by $S_i(t+1)$.
Pheromone density will be updated by using (7) and (8). If $f(S_i) \neq NC$, repeat Step 4 to Step 7 until the
pre-determined number of trials, Trial, is achieved.
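The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a clause is encoded as a pair of (bit index, sign) literals, the parameters `n_ants`, `max_trials`, `alpha`, `beta`, `rho` and `seed` are hypothetical defaults, pheromone and visibility are drawn from [0.1, 1] rather than [0, 1] to avoid a zero denominator, and the deposit of $1/f(S^{best})$ is added only to the values used by the best solution found so far.

```python
import random

def fitness(state, clauses):
    """Number of satisfied clauses; a literal (idx, sign) is true when state[idx] == sign."""
    return sum(any(state[i] == s for i, s in clause) for clause in clauses)

def binary_aco_2sat(clauses, n_bits, n_ants=20, max_trials=200,
                    alpha=1.0, beta=1.0, rho=0.1, seed=0):
    """Search for a bipolar bitstring that satisfies as many 2SAT clauses as possible."""
    rng = random.Random(seed)
    values = (1, -1)
    # Steps 3-4: random pheromone and visibility for each bit and candidate value.
    # The 0.1 lower bound is an assumption that keeps the denominator non-zero.
    tau = [{v: rng.uniform(0.1, 1.0) for v in values} for _ in range(n_bits)]
    eta = [{v: rng.uniform(0.1, 1.0) for v in values} for _ in range(n_bits)]
    # Steps 1-2: random initial bitstring, kept as the best-so-far solution.
    best = [rng.choice(values) for _ in range(n_bits)]
    best_fit = fitness(best, clauses)
    for _ in range(max_trials):
        if best_fit == len(clauses):
            break  # every clause is satisfied
        for _ in range(n_ants):
            # Step 5: each ant sets bit j to -1 with the probability of equation (5).
            ant = []
            for j in range(n_bits):
                w_neg = tau[j][-1] ** alpha * eta[j][-1] ** beta
                w_pos = tau[j][1] ** alpha * eta[j][1] ** beta
                p_neg = w_neg / (w_neg + w_pos)
                ant.append(-1 if rng.random() < p_neg else 1)
            # Step 7: keep the ant's solution if it beats the best so far.
            fit = fitness(ant, clauses)
            if fit > best_fit:
                best, best_fit = ant, fit
        # Step 6: evaporate everywhere, then deposit on the values of the best solution.
        deposit = 1.0 / best_fit if best_fit > 0 else 0.0
        for j in range(n_bits):
            for v in values:
                tau[j][v] *= (1.0 - rho)
            tau[j][best[j]] += deposit
    return best, best_fit
```

A call such as `binary_aco_2sat([((0, 1), (1, 1)), ((2, -1), (3, 1))], n_bits=4)` returns the best bitstring found together with its fitness; the search stops early once the fitness equals the number of clauses.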


4. LOGIC PROGRAMMING IN HOPFIELD NEURAL NETWORK 

A neural network is able to model complex relationships between inputs and outputs and to look for
patterns in data. Pattern recognition and function estimation are the reasons why neural networks are utilized
in data mining [28]. Several other researchers [29-31] have also employed neural networks in extracting critical
relationships among the data. HNN is one of the most commonly used neural network models. It is a simple
neural network model that has feedback connections. HNN systematically stores patterns as a content
addressable memory (CAM) [32]. HNN is a network of N interconnected neurons where the output and input
of each neuron are connected. The connection weight from neuron i to j is denoted by $w_{ij}$. In HNN, $w_{ij} = w_{ji}$
(symmetric networks) and $w_{ii} = w_{jj} = 0$ (no self-feedback connections). Let $S_i$ be the state or output of the
i-th unit and $\theta_i$ be the pre-defined threshold of unit i. For bipolar networks, $S_i$ is either +1 or -1. The general
updating rule in HNN is given by:

$$S_i = \begin{cases} 1, & \text{if } \sum_j w_{ij} S_j > \theta_i \\ -1, & \text{otherwise} \end{cases} \tag{10}$$

The local field of the network is given by:

$$h_i(t) = \sum_j w_{ij} S_j + w_i \tag{11}$$

The updating rule will be:

$$S_i(t+1) = \mathrm{sgn}[h_i(t)] \tag{12}$$

The final state of the neurons will be examined by using the Lyapunov or energy function:

$$H_{P_{2SAT}} = -\frac{1}{2}\sum_i \sum_j w_{ij} S_i S_j - \sum_i w_i S_i \tag{13}$$

The energy of HNN always decreases with the dynamics. 2SAT in HNN is abbreviated as the
HNN-2SAT model.
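The local field, updating rule and energy function can be sketched in Python as follows. The sketch assumes asynchronous updates, a zero threshold with the first-order weight $w_i$ acting as a bias, and a tie $h_i = 0$ resolved to -1 in line with the "otherwise" branch of (10); the function names and the two-neuron example, which uses the weights listed for the clause $C \vee D$ in Table 1, are illustrative only.

```python
def local_field(i, state, w2, w1):
    """Equation (11): h_i = sum_j w_ij * S_j + w_i, with no self-connection."""
    return sum(w2[i][j] * state[j] for j in range(len(state)) if j != i) + w1[i]

def energy(state, w2, w1):
    """Equation (13): H = -1/2 * sum_i sum_j w_ij S_i S_j - sum_i w_i S_i."""
    n = len(state)
    pair = sum(w2[i][j] * state[i] * state[j]
               for i in range(n) for j in range(n) if i != j)
    return -0.5 * pair - sum(w1[i] * state[i] for i in range(n))

def relax(state, w2, w1, max_sweeps=100):
    """Apply S_i <- sgn(h_i) neuron by neuron until no neuron changes."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            # Assumption: a zero local field maps to -1.
            new = 1 if local_field(i, state, w2, w1) > 0 else -1
            if new != state[i]:
                state[i], changed = new, True
        if not changed:
            break
    return state

# Two neurons C and D with the weights of the clause C OR D.
W1 = [0.25, 0.25]                    # first-order weights W_C, W_D
W2 = [[0.0, -0.25], [-0.25, 0.0]]    # symmetric second-order weight W_CD

final = relax([-1, -1], W2, W1)      # start from the only state that violates C OR D
```

In this example the violating state has energy 0.75, while every state that satisfies the clause has energy -0.25, so relaxation moves the network from the unsatisfied state to a satisfied one.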
5. 2 SATISFIABILITY BASED REVERSE ANALYSIS METHOD (2SATRA)

Logic mining will execute efficiently if the most favourable HNN-2SAT model is used. The neurons
(attributes) are represented in bipolar form {-1,1}. By acquiring the synaptic weight between 2 neurons,
2SATRA might be able to reveal the level of their connectedness. Therefore, Wan Abdullah's method [15] is
utilized in the learning phase of 2SATRA to figure out the accurate synaptic weight between the two neurons.
By considering both neurons C and D, where $S_C \in \{-1,1\}$ and $S_D \in \{-1,1\}$, the possible 2SAT clauses with their
corresponding synaptic weights are summarized in Table 1.


Table 1. Possible 2SAT Logic with Its Corresponding Synaptic Weight
Synaptic Weight    $P_1 = C \vee D$    $P_2 = \neg C \vee D$    $P_3 = C \vee \neg D$    $P_4 = \neg C \vee \neg D$
$W_C$              0.25                -0.25                    0.25                     -0.25
$W_D$              0.25                0.25                     -0.25                    -0.25
$W_{CD}$           -0.25               0.25                     0.25                     -0.25
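The weights of a clause follow from a short derivation. The sketch below assumes the usual form of Wan Abdullah's method, in which the cost function of a clause is matched term by term against the energy function (13); only the clause $C \vee D$ is worked out.

```latex
% Cost function: zero when C \vee D is satisfied, one when S_C = S_D = -1.
E_{C \vee D} = \frac{1}{4}(1 - S_C)(1 - S_D)
             = \frac{1}{4} - \frac{1}{4} S_C - \frac{1}{4} S_D + \frac{1}{4} S_C S_D

% Energy of the two-neuron network with symmetric weights, W_{CD} = W_{DC}.
H = -W_{CD} S_C S_D - W_C S_C - W_D S_D

% Matching coefficients (the constant 1/4 only shifts the energy).
W_C = \frac{1}{4}, \qquad W_D = \frac{1}{4}, \qquad W_{CD} = -\frac{1}{4}
```

Negating a literal replaces the corresponding $S$ by $-S$ in the cost function, which flips the sign of every weight that involves that neuron.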


As an example, given that neurons C and D show 1 and -1 respectively, $P_3$ will be selected as the
clause representation of the data set. In accordance with the nature of the neurons, 2SATRA will convert all
the data sets into 2SAT logic. Figure 1 shows the implementation of 2SATRA.


Step 1: Given binary learning and testing data sets with outcomes $P_{learn}$ and $P_{test}$, convert all binary data
to bipolar representation, where 0 is denoted as -1 and 1 remains 1.
Step 2: Initialize the synaptic weights of the neurons and assign all neurons the bipolar data obtained from Step 1.
Step 3: Segregate the collection of two neurons per clause, $C_1, C_2, C_3, \ldots$, that leads to $P_{learn} = 1$.
Step 4: Obtain $P_{best}$ by comparing the frequency of the 2SAT clauses in the overall learning data set.
Step 5: Check the clause satisfaction of $P_{best}$.
Step 6: Derive the synaptic weights of $P_{best}$ by using the WA method in Table 1 and randomize the states of the
neurons.
Step 7: Apply the Sathasivam relaxation method to the network.
Step 8: Find the final state of the neurons by computing the corresponding local field using Equation (11).
Step 9: Induce all possible 2SAT logic $P_1^B, P_2^B, P_3^B, \ldots$ from the neuron states.
Step 10: Examine all the induced logic $P_i^B$ by comparing the outcome of $P_i^B$ with $P_{test}$.
Step 11: Calculate the root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage
error (MAPE) and CPU time.

Figure 1. Algorithm of implementation of 2SATRA
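The bipolar conversion and the frequency-based choice of the best logic in Figure 1 can be sketched as follows. This is a simplified illustration under assumptions that do not come from the paper: neurons are grouped into fixed pairs, the clause assigned to a pair is the one whose literal signs match the states seen most often when the outcome is 1, and the toy data are invented.

```python
from collections import Counter

def to_bipolar(row):
    """Map binary entries to bipolar form: 0 becomes -1 and 1 stays 1."""
    return [1 if x == 1 else -1 for x in row]

def most_frequent_clauses(records, outcomes, pairs):
    """For each neuron pair, return the literal signs seen most often when the outcome is 1.

    A result of (1, -1) for the pair (C, D) stands for the clause C OR (NOT D).
    """
    best = {}
    for a, b in pairs:
        counts = Counter(
            (rec[a], rec[b]) for rec, out in zip(records, outcomes) if out == 1
        )
        if counts:  # skip pairs with no positive-outcome record
            best[(a, b)] = counts.most_common(1)[0][0]
    return best

# Hypothetical toy data: 4 match weeks (rows), 4 clubs (columns), binary encoded.
binary_records = [[1, 0, 1, 1], [1, 0, 0, 1], [1, 0, 1, 1], [0, 1, 1, 0]]
binary_outcomes = [1, 1, 1, 0]

records = [to_bipolar(r) for r in binary_records]
outcomes = to_bipolar(binary_outcomes)
rule = most_frequent_clauses(records, outcomes, pairs=[(0, 1), (2, 3)])
```

With the toy data, the pair (0, 1) is assigned the signs (1, -1) and the pair (2, 3) the signs (1, 1); the weight derivation, relaxation and testing stages of Figure 1 are outside the scope of this sketch.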


In the learning data set, the results {Win or Draw, Lose} will be converted into the bipolar representation {1,-1}
respectively. Each football club will be represented by a neuron in 2SATRA. Hence, there will be a
total of six neurons being considered in each data set. The respective football clubs and neurons are summarized
in Table 2.

Table 2. Respective Football Club and Neuron

Neuron    EPL                       SLL                     FL1
A         Arsenal (Ar)              Barcelona (Ba)          PSG (Psg)
B         Chelsea (Ch)              Sevilla (Se)            Lyon (Ly)
C         Liverpool (Li)            Real Madrid (Rm)        Monaco (Mo)
D         Man. City (Mc)            Atletico Madrid (Am)    Nantes (Na)
E         Man. United (Mu)          Valencia (Va)           Marseille (Ma)
F         Tottenham Hotspur (Sp)    Villarreal (Vi)         Toulouse (To)


In this paper, HNN will be incorporated with ACO in performing the 2SAT based Reverse Analysis method
(HNN-2SATACO). The proposed model will be compared with the existing model, HNN-2SATES [33].
All outputs that exceed the threshold CPU time, which is 24 hours, will be excluded. Both HNN-2SAT
models will be implemented in Dev C++ Version 5.11 on a computer equipped with an Intel Core i7 2.5GHz
processor and 8GB RAM using Windows 8.1. All HNN-2SAT program executions run 100 trials with 100
combinations of neurons to reduce statistical error [34].


6. PERFORMANCE EVALUATION

In order to evaluate the efficiency of all HNN-2SAT models, a total of four performance evaluation 
metrics, namely root mean square error, mean absolute error, mean absolute percentage error and CPU time 
will be analysed. 


6.1. Root Mean Square Error 
Root mean square error (RMSE) is normally used to compute the differences between the target value
and the actual observed value of the model. The equation for RMSE is defined as [35, 36]

$$RMSE = \sum_{i=1}^{n} \sqrt{\frac{1}{n}\left(f_{NC} - f_i\right)^2} \tag{14}$$

where $f_{NC}$ is the total number of 2SAT clauses, $f_i$ is the fitness of the solution in the HNN-2SAT model and n
is the number of iterations before $f_i = f_{NC}$. The best HNN-2SAT model will have the smallest value of
RMSE.


6.2. Mean Absolute Error 
Mean absolute error (MAE) is the mean of the absolute values of the errors. The error is derived
from each difference $f_{NC} - f_i$. MAE is defined by the following equation [37].

$$MAE = \sum_{i=1}^{n} \frac{1}{n}\left|f_{NC} - f_i\right| \tag{15}$$

where $f_{NC}$ is the total number of 2SAT clauses, $f_i$ is the fitness of the solution in the HNN-2SAT model and n
is the number of iterations before $f_i = f_{NC}$. Similar to RMSE, the least value of MAE indicates the best
HNN-2SAT model.


6.3. Mean Absolute Percentage Error 
Mean absolute percentage error (MAPE) is the mean of the absolute values of the errors in
percentage terms. MAPE is a measure of accuracy in percentage form. MAPE can be expressed as [19]

$$MAPE = \sum_{i=1}^{n} \frac{100}{n}\,\frac{\left|f_{NC} - f_i\right|}{\left|f_i\right|} \tag{16}$$

The theory of MAPE is very simple; however, it has a crucial flaw. MAPE cannot be used if the
observed value is zero, as this leads to division by zero. The best HNN-2SAT model will have the lowest
percentage of MAPE.
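The three error measures can be computed as in the following Python sketch. It follows the summation forms of (14) to (16), in which each term is scaled by $1/n$ before summing, so the RMSE here is not the conventional root of the mean squared error. The function names and the argument `fitness_history`, the list of fitness values $f_i$ recorded per iteration, are assumptions made for illustration.

```python
import math

def rmse(f_nc, fitness_history):
    """Equation (14): sum over iterations of sqrt((f_NC - f_i)^2 / n)."""
    n = len(fitness_history)
    return sum(math.sqrt((f_nc - f) ** 2 / n) for f in fitness_history)

def mae(f_nc, fitness_history):
    """Equation (15): sum over iterations of |f_NC - f_i| / n."""
    n = len(fitness_history)
    return sum(abs(f_nc - f) / n for f in fitness_history)

def mape(f_nc, fitness_history):
    """Equation (16): sum over iterations of (100 / n) * |f_NC - f_i| / |f_i|."""
    n = len(fitness_history)
    if any(f == 0 for f in fitness_history):
        # MAPE is undefined when an observed fitness value is zero.
        raise ZeroDivisionError("MAPE is undefined when a fitness value is zero")
    return sum(100.0 / n * abs(f_nc - f) / abs(f) for f in fitness_history)
```

For a run with 4 clauses whose fitness improves as 1, 2, 4, the MAE is 5/3 and the MAPE is 400/3 percent.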
6.4. Computational Time

CPU time is defined as the time required by a particular HNN-2SAT model to finish one execution.
CPU time denotes the stability and competency of the HNN-2SAT models. Each simulation will be run on
an identical processor to cancel out the effect of bad sectors and memory build-up. The equation of the CPU time is
given by [38]

CPU_Time = Learning_Time + Retrieval_Time (17)

A good HNN-2SAT model will be able to lessen the computation time in the learning phase. Hence, the best
HNN-2SAT model will have the shortest CPU time.
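Equation (17) can be measured as in the sketch below, which assumes Python's `time.process_time` as the CPU clock; the callables `learn` and `retrieve` are hypothetical placeholders for the two phases of an HNN-2SAT model.

```python
import time

def timed(fn, *args):
    """Run fn(*args) and return its result together with the CPU time it used."""
    start = time.process_time()
    result = fn(*args)
    return result, time.process_time() - start

def cpu_time(learn, retrieve, data):
    """CPU_Time = Learning_Time + Retrieval_Time for one execution."""
    model, learning_time = timed(learn, data)
    output, retrieval_time = timed(retrieve, model)
    return output, learning_time + retrieval_time
```

Because `time.process_time` counts only the processor time of the current process, the measurement is less sensitive to other programs running on the same machine than a wall-clock timer.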


7. RESULTS AND ANALYSIS 

A total of 4 performance evaluation metrics, namely RMSE, MAE, MAPE and CPU time, were analysed to
determine the effectiveness, precision and steadiness of HNN-2SAT in performing 2SATRA. NC is defined as the
total number of clauses, and 1 clause has 2 neurons (attributes). Figure 2, Figure 3, Figure 4 and Figure 5
show the results of RMSE, MAE, MAPE and CPU time respectively for HNN-2SATES and
HNN-2SATACO for all 3 leagues. In this execution, 92 data points were embedded into 2SATRA as learning
data and 60 as testing data. 6 clubs from each league were chosen, and the learning phase for all HNN models was
conducted with different values of NC.

Figure 2. RMSE for HNN-2SAT models
Figure 3. MAE for HNN-2SAT models
Figure 4. MAPE for HNN-2SAT models
Figure 5. CPU Time for HNN-2SAT models


Based on Figure 2, Figure 3 and Figure 4, HNN-2SATACO had significantly lower values of RMSE,
MAE and MAPE compared to HNN-2SATES. The learning phase of 2SATRA in HNN-2SATES was trapped in
a trial and error state, which led to the accumulation of RMSE, MAE and MAPE. On the contrary, the effect of
the interaction between the ants and the pheromone density helped HNN-2SATACO to diversify the candidate
solutions in the search space. Any non-fit solution after pheromone evaporation is improved further by pheromone
density initialization. The two-layered optimization mechanism of pheromone initialization and pheromone
evaporation reduced the deviation error of the network and resulted in minimal RMSE, MAE and MAPE.
Additionally, the learning error for 2SATRA in both HNN-2SATACO and HNN-2SATES increased as the
number of clauses increased. 2SATRA reached the maximum values of RMSE, MAE and MAPE when NC=10.
As the number of 2SAT inconsistencies increased, the probability of getting all clauses satisfied decreased
dramatically. This was due to the high number of iterations required to satisfy a high number of clauses during
the learning phase. Based on Figure 5, HNN-2SATACO required a short CPU time to complete a
single execution.

This was a result of HNN-2SATACO emphasizing fitness improvement in every iteration.
The significantly smaller error also reduced the CPU time for HNN-2SATACO to complete the learning phase.
However, at NC=8 and NC=10, HNN-2SATES was observed to have a shorter computation time compared to
HNN-2SATACO. This was because HNN-2SATES required fewer iterations before the network reached the
relaxation phase. The Sathasivam relaxation method [38] was able to lessen the neuron oscillation that would
otherwise prolong the CPU time and yield sub-optimal induced logic, $P_i^B$. During the learning phase, $P_i^B$ induced by
2SATRA managed to accomplish an accuracy of 88% (EPL), 90% (SLL) and 91% (FL1). This was because of
the character of the neurons in HNN: rather than oscillating, the neurons always converged to minimum
energy. The results in this paper were not compared to other existing methods because the approaches
are different and not comparable. For example, Huang and Chang [39] managed to achieve an accuracy of
76.9%. However, that research predicted the winner and loser of a match, while this paper
considers the relationship among the clubs in the league. The best induced logic, $P_{best}$, and the inconsistent
interpretation, $P_{inconsistent}$, for each football league are summarized in Table 3.


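Table 3. Best Induced Logic and Inconsistent Interpretation

League    Best induced logic, $P_{best}$                               Inconsistent interpretation, $P_{inconsistent}$
EPL       $(Ar \vee Ch) \wedge (Li \vee Mc) \wedge (Mu \vee Sp)$      $(\neg Ar \wedge \neg Ch) \vee (\neg Li \wedge \neg Mc) \vee (\neg Mu \wedge \neg Sp)$
SLL       $(Ba \vee Se) \wedge (Rm \vee Am) \wedge (Va \vee Vi)$      $(\neg Ba \wedge \neg Se) \vee (\neg Rm \wedge \neg Am) \vee (\neg Va \wedge \neg Vi)$
FL1       $(Psg \vee Ly) \wedge (Mo \vee Na) \wedge (Ma \vee To)$     $(\neg Psg \wedge \neg Ly) \vee (\neg Mo \wedge \neg Na) \vee (\neg Ma \wedge \neg To)$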
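By De Morgan's laws, each inconsistent interpretation in Table 3 is the negation of the corresponding best induced logic. The following Python sketch checks this for the EPL rule by enumerating all bipolar assignments of the six neurons; the function names are illustrative and not part of the original work.

```python
from itertools import product

# Clause pairs of the EPL rule in Table 3.
PAIRS = [("Ar", "Ch"), ("Li", "Mc"), ("Mu", "Sp")]
CLUBS = [club for pair in PAIRS for club in pair]

def p_best(s):
    """(Ar OR Ch) AND (Li OR Mc) AND (Mu OR Sp) in bipolar form."""
    return all(s[a] == 1 or s[b] == 1 for a, b in PAIRS)

def p_inconsistent(s):
    """(NOT Ar AND NOT Ch) OR (NOT Li AND NOT Mc) OR (NOT Mu AND NOT Sp)."""
    return any(s[a] == -1 and s[b] == -1 for a, b in PAIRS)

def count_interpretations():
    """Count the assignments that satisfy each formula over all 64 states."""
    satisfied = inconsistent = 0
    for values in product((1, -1), repeat=len(CLUBS)):
        s = dict(zip(CLUBS, values))
        # Exactly one of the two formulas holds for every assignment.
        assert p_best(s) != p_inconsistent(s)
        satisfied += p_best(s)
        inconsistent += p_inconsistent(s)
    return satisfied, inconsistent
```

Each clause is satisfied by 3 of the 4 states of its pair, so 27 of the 64 assignments satisfy the best induced logic and the remaining 37 fall under the inconsistent interpretation.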


Table 3 shows the relationship among the football clubs in each league. The key findings are
summarized in Table 4.


Table 4. Key Findings from Induced Logic
League Key Findings

EPL In any match week, if Arsenal, Liverpool and Manchester United lose their matches, the rest of the clubs, namely
Chelsea, Manchester City and Tottenham Hotspur, will have more player options during that week. With that
advantage, a club such as Chelsea has the privilege of sending its second best team. The implication of the logical rule
is that Chelsea's first team gains more training time and can focus on more important matches. This will reduce the
number of injuries faced by the club.

SLL When Barcelona, Real Madrid and Valencia win or draw their matches in any match week, Sevilla, Atletico Madrid
and Villarreal will have to send their best teams to try to secure a win. This is because the logical rule implies that
Sevilla, Atletico Madrid and Villarreal might lose if Barcelona, Real Madrid and Valencia win or draw in a certain match
week. For example, the best players of Sevilla will have to be in great condition and concentrate on the match that week.

FL1 If clubs like Lyon, Nantes and Toulouse lose in a certain match week, the opponents of PSG, Monaco and Marseille
could take advantage of it by sending their best players, knowing that there is a possibility for a club like PSG to send its
second best team for that week. The logical rule gives precious information not only to the 6 clubs included in this
research but also to their opponents in any match week.


The results have shown that 2SATRA has decent potential to obtain a logical rule that classifies the
result of a football match as win or draw, or lose. The induced logic can help football club managers and
coaches in deciding the strategies and the players that they should field in a certain matchup. Football analysts
could also use the induced logic to provide expert discussion during a football game.


8. CONCLUSION 

In this research, 2SATRA is shown to be an effective relationship extraction system for modelling the
results of football matches. The effectiveness of 2SATRA in logic mining is examined by using 3 data
sets from 3 different leagues. The results acquired show that 2SATRA has decent potential to obtain optimal
logic from the learned data set. ACO also outperforms ES in extracting the relationship among the football clubs
in the league. Future research could utilize other logical rules, such as randomized KSAT where k > 2,
or integrate other metaheuristic algorithms, such as Particle Swarm Optimization and the Artificial Immune
System, to accelerate the learning phase of 2SATRA.